Welcome to this notebook on Responsible AI. This project aims to touch on the most important topics in the field of Responsible AI including bias, fairness, interpretability, error analysis and counterfactuals.
We use the famous UCI Adult Census Income dataset to apply the responsible AI concepts.
# Standard Libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.preprocessing import LabelEncoder
from sklearn.metrics import balanced_accuracy_score, roc_curve, auc, classification_report
# Classifiers
from sklearn.tree import DecisionTreeClassifier, plot_tree
from sklearn.ensemble import GradientBoostingClassifier
# Interpretability Libraries
from pdpbox import pdp #, get_dataset, info_plots
import shap
from lime.lime_tabular import LimeTabularExplainer
from fairlearn.postprocessing import ThresholdOptimizer
from fairlearn.metrics import (
MetricFrame,
selection_rate,
count,
selection_rate_difference,
)
from dice_ml import Data, Model, Dice
# For plotting matplotlib plots in jupyter notebook
%matplotlib inline
Import Data
# Using the same feature names as in Lab 1 (more understandable than the default ones)
names = ['age','workclass','fnlwgt','education','education-num',
'marital-status','occupation','relationship','race','sex',
'capital-gain','capital-loss','hours-per-week','native-country',
'y']
df_train = pd.read_csv("data/adult.csv",names=names)
df_test = pd.read_csv("data/adult.test",names=names)
df_train_copy = df_train.copy()
df_test_copy = df_test.copy()
Preprocessing and EDA
First, we need to check for duplicated entries and remove them
print(sum(df_train.duplicated()))
print(sum(df_test.duplicated()))
24 5
df_train = df_train.drop_duplicates()
df_test = df_test.drop_duplicates()
print(df_train.shape)
print(df_test.shape)
(32537, 15) (16277, 15)
We have 32537 observations in the train set and 16277 in the test set. There are 14 feature plus 1 target variable.
The final weight fnlwgt feature is not well defined and commonly dropped from the data. (Originally it indicates the number of people the census believes the entry represents)
df_train.drop('fnlwgt', axis=1, inplace=True)
df_test.drop('fnlwgt', axis=1, inplace=True)
df_train.head()
| age | workclass | education | education-num | marital-status | occupation | relationship | race | sex | capital-gain | capital-loss | hours-per-week | native-country | y | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 39 | State-gov | Bachelors | 13 | Never-married | Adm-clerical | Not-in-family | White | Male | 2174 | 0 | 40 | United-States | <=50K |
| 1 | 50 | Self-emp-not-inc | Bachelors | 13 | Married-civ-spouse | Exec-managerial | Husband | White | Male | 0 | 0 | 13 | United-States | <=50K |
| 2 | 38 | Private | HS-grad | 9 | Divorced | Handlers-cleaners | Not-in-family | White | Male | 0 | 0 | 40 | United-States | <=50K |
| 3 | 53 | Private | 11th | 7 | Married-civ-spouse | Handlers-cleaners | Husband | Black | Male | 0 | 0 | 40 | United-States | <=50K |
| 4 | 28 | Private | Bachelors | 13 | Married-civ-spouse | Prof-specialty | Wife | Black | Female | 0 | 0 | 40 | Cuba | <=50K |
The target variable y indicates whether a person's income is above or below 50k per year. The task is to precdict the income variable.
df_train.isnull().sum()
age 0 workclass 0 education 0 education-num 0 marital-status 0 occupation 0 relationship 0 race 0 sex 0 capital-gain 0 capital-loss 0 hours-per-week 0 native-country 0 y 0 dtype: int64
df_test.isnull().sum()
age 0 workclass 1 education 1 education-num 1 marital-status 1 occupation 1 relationship 1 race 1 sex 1 capital-gain 1 capital-loss 1 hours-per-week 1 native-country 1 y 1 dtype: int64
print(df_test.isnull().any(axis=1).sum())
1
No null values in the train set. There is one entry in the test set with most features being null. As this entry does not contain much information (especially the income is unknown) we can drop this entry.
df_test = df_test.dropna()
Lets perform some quick Exploratory Data Analysis
df_train.describe()
| age | education-num | capital-gain | capital-loss | hours-per-week | |
|---|---|---|---|---|---|
| count | 32537.000000 | 32537.000000 | 32537.000000 | 32537.000000 | 32537.000000 |
| mean | 38.585549 | 10.081815 | 1078.443741 | 87.368227 | 40.440329 |
| std | 13.637984 | 2.571633 | 7387.957424 | 403.101833 | 12.346889 |
| min | 17.000000 | 1.000000 | 0.000000 | 0.000000 | 1.000000 |
| 25% | 28.000000 | 9.000000 | 0.000000 | 0.000000 | 40.000000 |
| 50% | 37.000000 | 10.000000 | 0.000000 | 0.000000 | 40.000000 |
| 75% | 48.000000 | 12.000000 | 0.000000 | 0.000000 | 45.000000 |
| max | 90.000000 | 16.000000 | 99999.000000 | 4356.000000 | 99.000000 |
Main charateristics:
df_train.info()
<class 'pandas.core.frame.DataFrame'> Index: 32537 entries, 0 to 32560 Data columns (total 14 columns): # Column Non-Null Count Dtype --- ------ -------------- ----- 0 age 32537 non-null int64 1 workclass 32537 non-null object 2 education 32537 non-null object 3 education-num 32537 non-null int64 4 marital-status 32537 non-null object 5 occupation 32537 non-null object 6 relationship 32537 non-null object 7 race 32537 non-null object 8 sex 32537 non-null object 9 capital-gain 32537 non-null int64 10 capital-loss 32537 non-null int64 11 hours-per-week 32537 non-null int64 12 native-country 32537 non-null object 13 y 32537 non-null object dtypes: int64(5), object(9) memory usage: 3.7+ MB
df_test.info()
<class 'pandas.core.frame.DataFrame'> Index: 16276 entries, 1 to 16281 Data columns (total 14 columns): # Column Non-Null Count Dtype --- ------ -------------- ----- 0 age 16276 non-null object 1 workclass 16276 non-null object 2 education 16276 non-null object 3 education-num 16276 non-null float64 4 marital-status 16276 non-null object 5 occupation 16276 non-null object 6 relationship 16276 non-null object 7 race 16276 non-null object 8 sex 16276 non-null object 9 capital-gain 16276 non-null float64 10 capital-loss 16276 non-null float64 11 hours-per-week 16276 non-null float64 12 native-country 16276 non-null object 13 y 16276 non-null object dtypes: float64(4), object(10) memory usage: 1.9+ MB
There are 9 categorical variables that have to be encoded:
workclasseducationmarital-statusoccupationreleationshipracesexnative-countryyage is encoded as strings in the test set. We cast age to int64. Also the numeric variables in the test set are float64. We cast them to int64 such that types are consistent accross train and test data
# Numerical variables
numerics = ['age', 'education-num', 'capital-gain', 'capital-loss', 'hours-per-week']
# Casting to int64
df_test[numerics] = df_test[numerics].astype('int64')
# Categorical variables
categories = ['workclass', 'education', 'marital-status', 'occupation', 'relationship', 'race', 'sex', 'native-country', 'y']
# Checking uniques values of categorical variables
for cat in categories:
list_train = np.sort(list(df_train[cat].unique()))
list_test = np.sort(list(df_test[cat].unique()))
print(cat, " train set: ", list_train)
print(cat, ": test set", list_test)
equal = np.array_equal(list_train,list_test)
print("Are both sets equal? ", equal)
if not equal:
print("Unique categories :", set(list_train) ^ set(list_test))
print('-'*50)
workclass train set: [' ?' ' Federal-gov' ' Local-gov' ' Never-worked' ' Private'
' Self-emp-inc' ' Self-emp-not-inc' ' State-gov' ' Without-pay']
workclass : test set [' ?' ' Federal-gov' ' Local-gov' ' Never-worked' ' Private'
' Self-emp-inc' ' Self-emp-not-inc' ' State-gov' ' Without-pay']
Are both sets equal? True
--------------------------------------------------
education train set: [' 10th' ' 11th' ' 12th' ' 1st-4th' ' 5th-6th' ' 7th-8th' ' 9th'
' Assoc-acdm' ' Assoc-voc' ' Bachelors' ' Doctorate' ' HS-grad'
' Masters' ' Preschool' ' Prof-school' ' Some-college']
education : test set [' 10th' ' 11th' ' 12th' ' 1st-4th' ' 5th-6th' ' 7th-8th' ' 9th'
' Assoc-acdm' ' Assoc-voc' ' Bachelors' ' Doctorate' ' HS-grad'
' Masters' ' Preschool' ' Prof-school' ' Some-college']
Are both sets equal? True
--------------------------------------------------
marital-status train set: [' Divorced' ' Married-AF-spouse' ' Married-civ-spouse'
' Married-spouse-absent' ' Never-married' ' Separated' ' Widowed']
marital-status : test set [' Divorced' ' Married-AF-spouse' ' Married-civ-spouse'
' Married-spouse-absent' ' Never-married' ' Separated' ' Widowed']
Are both sets equal? True
--------------------------------------------------
occupation train set: [' ?' ' Adm-clerical' ' Armed-Forces' ' Craft-repair' ' Exec-managerial'
' Farming-fishing' ' Handlers-cleaners' ' Machine-op-inspct'
' Other-service' ' Priv-house-serv' ' Prof-specialty' ' Protective-serv'
' Sales' ' Tech-support' ' Transport-moving']
occupation : test set [' ?' ' Adm-clerical' ' Armed-Forces' ' Craft-repair' ' Exec-managerial'
' Farming-fishing' ' Handlers-cleaners' ' Machine-op-inspct'
' Other-service' ' Priv-house-serv' ' Prof-specialty' ' Protective-serv'
' Sales' ' Tech-support' ' Transport-moving']
Are both sets equal? True
--------------------------------------------------
relationship train set: [' Husband' ' Not-in-family' ' Other-relative' ' Own-child' ' Unmarried'
' Wife']
relationship : test set [' Husband' ' Not-in-family' ' Other-relative' ' Own-child' ' Unmarried'
' Wife']
Are both sets equal? True
--------------------------------------------------
race train set: [' Amer-Indian-Eskimo' ' Asian-Pac-Islander' ' Black' ' Other' ' White']
race : test set [' Amer-Indian-Eskimo' ' Asian-Pac-Islander' ' Black' ' Other' ' White']
Are both sets equal? True
--------------------------------------------------
sex train set: [' Female' ' Male']
sex : test set [' Female' ' Male']
Are both sets equal? True
--------------------------------------------------
native-country train set: [' ?' ' Cambodia' ' Canada' ' China' ' Columbia' ' Cuba'
' Dominican-Republic' ' Ecuador' ' El-Salvador' ' England' ' France'
' Germany' ' Greece' ' Guatemala' ' Haiti' ' Holand-Netherlands'
' Honduras' ' Hong' ' Hungary' ' India' ' Iran' ' Ireland' ' Italy'
' Jamaica' ' Japan' ' Laos' ' Mexico' ' Nicaragua'
' Outlying-US(Guam-USVI-etc)' ' Peru' ' Philippines' ' Poland'
' Portugal' ' Puerto-Rico' ' Scotland' ' South' ' Taiwan' ' Thailand'
' Trinadad&Tobago' ' United-States' ' Vietnam' ' Yugoslavia']
native-country : test set [' ?' ' Cambodia' ' Canada' ' China' ' Columbia' ' Cuba'
' Dominican-Republic' ' Ecuador' ' El-Salvador' ' England' ' France'
' Germany' ' Greece' ' Guatemala' ' Haiti' ' Honduras' ' Hong' ' Hungary'
' India' ' Iran' ' Ireland' ' Italy' ' Jamaica' ' Japan' ' Laos'
' Mexico' ' Nicaragua' ' Outlying-US(Guam-USVI-etc)' ' Peru'
' Philippines' ' Poland' ' Portugal' ' Puerto-Rico' ' Scotland' ' South'
' Taiwan' ' Thailand' ' Trinadad&Tobago' ' United-States' ' Vietnam'
' Yugoslavia']
Are both sets equal? False
Unique categories : {' Holand-Netherlands'}
--------------------------------------------------
y train set: [' <=50K' ' >50K']
y : test set [' <=50K.' ' >50K.']
Are both sets equal? False
Unique categories : {' <=50K.', ' >50K.', ' <=50K', ' >50K'}
--------------------------------------------------
Looking at the unique values of the categorical variables in the train and test set we can make four observations.
Observation 1: workclass, occupation and native-conutry contain ' ?' values which indicate missing values. They will be encoded as null values and then imputed by the most frequent occuring value (mode)
df_train[df_train== ' ?'] = np.nan
df_test[df_test== ' ?'] = np.nan
df_train.isnull().sum()
age 0 workclass 1836 education 0 education-num 0 marital-status 0 occupation 1843 relationship 0 race 0 sex 0 capital-gain 0 capital-loss 0 hours-per-week 0 native-country 582 y 0 dtype: int64
df_test.isnull().sum()
age 0 workclass 963 education 0 education-num 0 marital-status 0 occupation 966 relationship 0 race 0 sex 0 capital-gain 0 capital-loss 0 hours-per-week 0 native-country 274 y 0 dtype: int64
for col in ['workclass', 'occupation', 'native-country']:
df_train[col].fillna(df_train[col].mode()[0], inplace=True)
df_test[col].fillna(df_test[col].mode()[0], inplace=True)
Observation 2: All categories have some whitespace that can be removed and the target variable in the test set has both outcomes written in two ways (i.e >50K vs. >50K. and <=50K vs. <=50K.)
# Removing Whitespaces in categories
for cat in categories:
df_train[cat] = df_train[cat].str.strip()
df_test[cat] = df_test[cat].str.strip()
# Fixing target variable in test set
map = {'>50K.': '>50K', '<=50K.': '<=50K', '>50K': '>50K', '<=50K': '<=50K'}
df_test['y'] = [map[y] for y in df_test['y']]
Observation 3: There is one value for the native-country category in the train data which does not exist in the test data (i.e. Holand-Netherlands). Since only one entry in the train dataset is concerned and we don't want to risk data leakage we will simply remove this entry
df_train.drop(index=df_train[df_train['native-country'] == 'Holand-Netherlands'].index, inplace=True)
Lets confirm that all categories are identical across both the train and the test data
equal_list = []
for cat in categories:
list_train = np.sort(list(df_train[cat].unique()))
list_test = np.sort(list(df_test[cat].unique()))
equal_list.append(np.array_equal(list_train,list_test))
if all(equal_list):
print("All category values are equal in both train and test set")
All category values are equal in both train and test set
Observation 4: We find that the categorical variables don't really have an intrinsic order. One hot encoding is more reasonable and interpretable, however, one hot encoding would lead to a much higher feature space. Therefore we first combine some categories and then apply one hot encoding. Only the occupation feature can be interpreted as ordinal (in terms of expected saleries). We are using the same encoding as in Lab 3 from COMP0189 Applied AI
occupation
# Copied from COMP0189 Applied AI Lab 3
occ_mapping = {
'Priv-house-serv':0, 'Other-service':0,'Handlers-cleaners':0,
'Farming-fishing':1,'Machine-op-inspct':1,'Adm-clerical':1,
'Transport-moving':2,'Craft-repair':2,'Sales':2,
'Armed-Forces':3,'Tech-support':3,'Protective-serv':3,
'Prof-specialty':4,'Exec-managerial':4
}
df_train["occupation"] = df_train["occupation"].map(occ_mapping).astype('int64')
df_test["occupation"] = df_test["occupation"].map(occ_mapping).astype('int64')
education
df_train['education'].value_counts()
education HS-grad 10494 Some-college 7281 Bachelors 5353 Masters 1722 Assoc-voc 1382 11th 1175 Assoc-acdm 1067 10th 933 7th-8th 645 Prof-school 576 9th 514 12th 433 Doctorate 413 5th-6th 332 1st-4th 166 Preschool 50 Name: count, dtype: int64
# Combing all school level into one category
df_train['education'].replace(['12th', '11th', '9th', '7th-8th', '5th-6th', '10th', '1st-4th', 'Preschool'],
'School', inplace = True)
df_test['education'].replace(['12th', '11th', '9th', '7th-8th', '5th-6th', '10th', '1st-4th', 'Preschool'],
'School', inplace = True)
df_train['education'].value_counts()
education HS-grad 10494 Some-college 7281 Bachelors 5353 School 4248 Masters 1722 Assoc-voc 1382 Assoc-acdm 1067 Prof-school 576 Doctorate 413 Name: count, dtype: int64
race
df_train['race'].value_counts()
race White 27794 Black 3122 Asian-Pac-Islander 1038 Amer-Indian-Eskimo 311 Other 271 Name: count, dtype: int64
# Combing Asian-Pac-Islander Amer-Indian-Eskimo into other
df_train['race'].replace(['Asian-Pac-Islander', 'Amer-Indian-Eskimo'],
'Other', inplace = True)
df_test['race'].replace(['Asian-Pac-Islander', 'Amer-Indian-Eskimo'],
'Other', inplace = True)
df_train['race'].value_counts()
race White 27794 Black 3122 Other 1620 Name: count, dtype: int64
native-country
df_train['native-country'].value_counts()
native-country United-States 29735 Mexico 639 Philippines 198 Germany 137 Canada 121 Puerto-Rico 114 El-Salvador 106 India 100 Cuba 95 England 90 Jamaica 81 South 80 China 75 Italy 73 Dominican-Republic 70 Vietnam 67 Japan 62 Guatemala 62 Poland 60 Columbia 59 Taiwan 51 Haiti 44 Iran 43 Portugal 37 Nicaragua 34 Peru 31 France 29 Greece 29 Ecuador 28 Ireland 24 Hong 20 Cambodia 19 Trinadad&Tobago 19 Thailand 18 Laos 18 Yugoslavia 16 Outlying-US(Guam-USVI-etc) 14 Honduras 13 Hungary 13 Scotland 12 Name: count, dtype: int64
# Combing countries in continents and keeping USA seperate as most samples are from the US
usa = ["United-States"]
central_america =["Cuba", "Dominican-Republic", "El-Salvador", "Guatemala",
"Haiti", "Honduras", "Jamaica", "Mexico", "Nicaragua",
"Outlying-US(Guam-USVI-etc)", "Puerto-Rico", "Trinadad&Tobago"]
asia =["Cambodia", "China", "Hong", "India", "Iran", "Japan", "Laos",
"Philippines", "Taiwan", "Thailand", "Vietnam"]
south_america =["Columbia", "Ecuador", "Peru"]
europe =["England", "France", "Germany", "Greece", "Holand-Netherlands",
"Hungary", "Ireland", "Italy", "Poland", "Portugal", "Scotland",
"Yugoslavia"]
# Canada does not fit to the other continents and is not USA and therefore categorised as other. Only 121 are concered in train set
other =["South", "Canada"]
continents = [usa, central_america, asia, south_america, europe, other]
continent_names = ['usa', 'central-america', 'asia', 'south-america', 'europe', 'other']
for i, continent in enumerate(continents):
df_train['native-country'].replace(continent, continent_names[i], inplace = True)
df_test['native-country'].replace(continent, continent_names[i], inplace = True)
df_train['native-country'].value_counts()
native-country usa 29735 central-america 1291 asia 671 europe 520 other 201 south-america 118 Name: count, dtype: int64
marital-status
df_train['marital-status'].value_counts()
marital-status Married-civ-spouse 14970 Never-married 10666 Divorced 4441 Separated 1025 Widowed 993 Married-spouse-absent 418 Married-AF-spouse 23 Name: count, dtype: int64
# marital-status can be reduced to 'married', 'not married' and 'never married'
df_train['marital-status'].replace(['Married-civ-spouse', 'Married-spouse-absent', 'Married-AF-spouse'], 'married', inplace = True)
df_test['marital-status'].replace(['Married-civ-spouse', 'Married-spouse-absent', 'Married-AF-spouse'], 'married', inplace = True)
df_train['marital-status'].replace(['Divorced', 'Separated', 'Widowed'], 'not-married', inplace = True)
df_test['marital-status'].replace(['Divorced', 'Separated', 'Widowed'], 'not-married', inplace = True)
df_train['marital-status'].value_counts()
marital-status married 15411 Never-married 10666 not-married 6459 Name: count, dtype: int64
workclass:
df_train['workclass'].value_counts()
Private 24508 Self-emp-not-inc 2540 Local-gov 2093 State-gov 1298 Self-emp-inc 1116 Federal-gov 960 Without-pay 14 Never-worked 7 Name: workclass, dtype: int64
# workclass can be reduced to 'Goverment-sector', 'Unemployed', 'Self-employed', 'Private'
df_train['workclass'].replace(['Local-gov', 'Federal-gov', 'State-gov'], 'Goverment-sector', inplace = True)
df_test['workclass'].replace(['Local-gov', 'Federal-gov', 'State-gov'], 'Goverment-sector', inplace = True)
df_train['workclass'].replace(['Without-pay', 'Never-worked'], 'Unemployed', inplace = True)
df_test['workclass'].replace(['Without-pay', 'Never-worked'], 'Unemployed', inplace = True)
df_train['workclass'].replace(['Self-emp-not-inc', 'Self-emp-inc'], 'Self-employed', inplace = True)
df_test['workclass'].replace(['Self-emp-not-inc', 'Self-emp-inc'], 'Self-employed', inplace = True)
df_train['workclass'].value_counts()
Private 24508 Goverment-sector 4351 Self-employed 3656 Unemployed 21 Name: workclass, dtype: int64
Now let use plot the distribution of all variables and gain some insights
plt.figure(figsize=(30,20))
for i,column in enumerate(df_train.columns):
plt.subplot(5,3,i+1)
plt.title(f"Distribution of {column}")
sns.histplot(df_train[column])
plt.tight_layout()
plt.plot()
Key observations:
df_train['y'].value_counts()/df_train.shape[0]
<=50K 0.759067 >50K 0.240933 Name: y, dtype: float64
df_test['y'].value_counts()/df_test.shape[0]
<=50K 0.763701 >50K 0.236299 Name: y, dtype: float64
About 76% of all samples earn less 50K in both train and test data meaning the data is highly imbalanced. To take that into account for evaluating our model's performance we will use sklearn's balanced_accuracy_score as a main metric to get more meaningful results.
Lets pair plot the numerical variables against the target variable income.
sns.pairplot(df_train, hue="y", corner=True);
Key observations:
Handlers-cleaners, Service-Others) have low incomehours-per-week is normally distributed with mean 40Let's plot the correlation matrix for the numeric features
plt.figure(figsize=(15,10))
ax = sns.heatmap(df_train.corr(),annot = True)
#Fixing a bug in the representation
bottom, top = ax.get_ylim()
ax.set_ylim(bottom + 0.5, top - 0.5)
plt.title('Correlation Matrix')
plt.show()
There is a strong positive correlation between education-num and occupation (0.42). We should take that into account when models use those features to predict the income.
Less strong but positive correlations are between
education-num and hours-per-week (0.15)age and occupation (0.15)occupation and hours-per-week (0.13)education-num and capital-gain (0.12)The same correlations were found on the test set
Lets check for outliers
plt.figure(figsize=(10,6), dpi=80)
for i,column in enumerate(df_train[numerics]):
plt.subplot(2,3,i+1)
plt.title(f"Distribution of {column}")
ax = sns.boxplot(data=df_train, x=column)
plt.tight_layout()
plt.plot()
Most measurements seem reasonable except from capital-gain where some samples are far off (99999) from the other samples. We interpret that as measurement mistakes and replace by the median. The same observation was made on the test test
print(len(df_train['capital-gain'][df_train['capital-gain']==99999]))
print(len(df_test['capital-gain'][df_test['capital-gain']==99999]))
159 85
159 samples are affected in the train set and 85 in the test set. Lets replace them by the respective median
meadian_train = df_train['capital-gain'].median()
df_train['capital-gain'].replace(99999, meadian_train,inplace=True)
meadian_test = df_test['capital-gain'].median()
df_test['capital-gain'].replace(99999, meadian_test,inplace=True)
sns.boxplot(data=df_train, x='capital-gain')
<matplotlib.axes._subplots.AxesSubplot at 0x7f139af1eaf0>
Now we can encode all categorical variables using One Hot encoding and split the data.
# Saving a copy for better analysis later
df_train_pre_encoding = df_train.copy()
df_test_pre_encoding = df_test.copy()
# Encoding target variable
df_train['y'] = np.where(df_train['y'] == '>50K', 1, 0)
df_test['y'] = np.where(df_test['y'] == '>50K', 1, 0)
# Encoding sex
df_train['sex'] = np.where(df_train['sex'] == 'Male', 1, 0)
df_test['sex'] = np.where(df_test['sex'] == 'Female', 1, 0)
# Train data
X_train = df_train.drop(['y'], axis=1)
y_train = df_train['y']
# Test data
X_test = df_test.drop(['y'], axis=1)
y_test = df_test['y']
# Could be used for other encoding techniques
def encode(method, features, X_train, X_test):
if method == 'onehot-drop':
X_train_encoded = pd.get_dummies(X_train, drop_first = True, columns=features)
X_test_encoded = pd.get_dummies(X_test, drop_first = True, columns=features)
return X_train_encoded, X_test_encoded
elif method == 'onehot-nodrop':
X_train_encoded = pd.get_dummies(X_train, drop_first = False, columns=features)
X_test_encoded = pd.get_dummies(X_test, drop_first = False, columns=features)
return X_train_encoded, X_test_encoded
elif method == 'label':
le = LabelEncoder()
X_train_encoded = X_train.copy()
X_test_encoded = X_test.copy()
for cat in features:
X_train_encoded[cat] = le.fit_transform(X_train_encoded[cat])
X_test_encoded[cat] = le.transform(X_test_encoded[cat])
return X_train_encoded, X_test_encoded
else:
print('method not available')
# Encoding non binary attributes
categories_non_binary = ['workclass', 'education', 'marital-status', 'relationship', 'race', 'native-country'] # occupation is now numeric
# Here we use onehot encoding without dropping redundant categories to increase interpretability. Other methods are possible as well
X_train, X_test = encode('onehot-nodrop', categories_non_binary, X_train, X_test)
Lets plot the correlation matrix again with the one hot encoded variables
plt.figure(figsize=(30,30))
sns.heatmap(X_train.corr(),annot = True)
plt.title('Correlation Matrix')
plt.show()
One interesting correlation I want to point out is:
marital-status-married and sex (0.42)That means that being married strongly correlates with being male in this dataset.
Lets check for correlations with the target variable
plt.figure(figsize=(15,15))
pd.concat([X_train, y_train], axis=1).corr()['y'].sort_values()[:-1].plot.barh()
plt.title("Correlations with respect to income")
plt.show()
The most correlated features with income are:
marital-status_marriedrelationship_husbandeducation-numcapital-gainoccupationageThese are not causal relationships.
Decision Trees are interpretable by design models [1, Chapter 5.4]. Trees are invariant to feature scaling. Therefore we do not normalize the numeric features which will also help interpreting the models
# generic fitting
def fit(classifier, X_train, X_test, y_train):
clf = classifier
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
return y_pred
# using balanced accuracy to take imbalance into account
def evaluate(y_pred, y_test):
return round(balanced_accuracy_score(y_pred, y_test),3)
# printing the classification report
def report(y_pred, y_test):
print(classification_report(y_test, y_pred, target_names=['<=50K', '>50K']))
Let us first build a large Decision Tree. Since there 38 feature after one hot encoding the tree becomes enormous with default hyperparameter settings. Therefore we regard a tree with max_depth=10 and min_sample_split=1000 as large.
class_names = ['<=50K', '>50K']
feature_names = list(X_train.columns)
# Saving accuracies
acc = []
dt_large = DecisionTreeClassifier(criterion='gini', random_state=1, min_samples_split=1000, max_depth=10)
y_pred = fit(dt_large, X_train, X_test, y_train)
acc_dt_large = evaluate(y_pred, y_test)
acc.append(acc_dt_large)
print(dt_large, ' balanced accuracy: ', acc_dt_large)
DecisionTreeClassifier(max_depth=10, min_samples_split=1000, random_state=1) balanced accuracy: 0.819
Now the small Decision Tree will have max_depth=3.
dt_small = DecisionTreeClassifier(criterion='gini', random_state=1, max_depth=3)
y_pred = fit(dt_small, X_train, X_test, y_train)
acc_dt_small = evaluate(y_pred, y_test)
acc.append(acc_dt_small)
print(dt_small, ' balanced accuracy: ', evaluate(y_pred, y_test))
DecisionTreeClassifier(max_depth=3, random_state=1) balanced accuracy: 0.802
As black box model I choose a Gradient Boosting Tree (GBT) classifier. GBTs are known to perform well on tabular data . I have also tried some neural networks but the performance was quite poor compared to the Decision Trees. Therefore I choose another tree based algorithms but one that is highly intransparent. There is also a paper on why trees still outperform neural networks [3].
A GBT is a black box models because it combines a sequence of multiple decision trees. Each tree is opitimized to improve the mistakes made by the previous tree. Even though each individual tree can be analysed it becomes really hard to interprete the ensemble of trees, especially as there potentially hundreds of trees combined.
The hyperparameters were manually tuned because GridSearch was quite inefficient given the wide range hyperparameters they could take.
Parameters:
learning_rate=0.05max_depth=5n_estimators=200black_box = GradientBoostingClassifier(learning_rate= 0.05, max_depth=5, n_estimators=200)
y_pred = fit(black_box, X_train, X_test, y_train)
acc_black_box = evaluate(y_pred, y_test)
acc.append(acc_black_box)
print(black_box, ' balanced accuracy: ', acc_black_box)
GradientBoostingClassifier(learning_rate=0.05, max_depth=5, n_estimators=200) balanced accuracy: 0.854
First, let us plot the ROC curve of all classifiers we have build.
# Calculate the ROC curve and AUC for each tree
fpr1, tpr1, thresholds1 = roc_curve(y_test, dt_large.predict_proba(X_test)[:, 1])
fpr2, tpr2, thresholds2 = roc_curve(y_test, dt_small.predict_proba(X_test)[:, 1])
fpr3, tpr3, thresholds3 = roc_curve(y_test, black_box.predict_proba(X_test)[:, 1])
auc1 = round(auc(fpr1, tpr1),3)
auc2 = round(auc(fpr2, tpr2),3)
auc3 = round(auc(fpr3, tpr3),3)
aucs = [auc1, auc2, auc3]
# Plot the ROC curves
fig = plt.figure(figsize=(8,6))
plt.plot(fpr1, tpr1, label=f'Large DT 1 (AUC = {auc1:.2f})')
plt.plot(fpr2, tpr2, label=f'Small DT (AUC = {auc2:.2f})')
plt.plot(fpr3, tpr3, label=f'GBT (AUC = {auc3:.2f})')
plt.plot([0, 1], [0, 1], linestyle='--', color='grey')
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('ROC Curve')
plt.legend()
plt.show()
We find that all classifiers perform pretty well (AUC >= 0.85). The large Decsion Tree preforms very similar to the GBT motivating to rather use more interpretable Decsion Trees instead of a more complex black box model such as GBT. However due to the imbalance in the target variable we concentrate on balanced_accuracy_score to evaluate the models
algorithms = ['Large DT', 'Small DT', 'GBT']
table = pd.DataFrame({
'Algorithm': algorithms,
'Test Balanced Accuracy': acc,
'AUC': aucs
})
table.head()
| Algorithm | Test Balanced Accuracy | AUC | |
|---|---|---|---|
| 0 | Large DT | 0.819 | 0.899 |
| 1 | Small DT | 0.802 | 0.850 |
| 2 | GBT | 0.854 | 0.919 |
As for Balanced Accuracy all models perform quite well on the test set (>0.8). As expected the black box model GBT (0.855) performs significantly better than the Decision trees. Interestingly, the Large Decision Tree (0.819) only performs slightly better than the small Decision Tree (0.802).
Decision Trees are interpretable by Design meaning we can plot the tree and understand the decision making of the model.
fig = plt.figure(figsize=(60,30))
plot_tree(dt_large, filled=True, feature_names=feature_names, class_names=class_names)
plt.tight_layout()
plt.suptitle("Large Decsion Tree", fontsize=50)
plt.show()
Obviously the tree is too big to get an get an overall idea how the model determined its prediction. The color intensity of a node indicates its purity. So for example in dark organge nodes all samples are predicted as <=50K whereas for dark blue nodes all samples are predicted as >50K. The less intense the color the more impurity. Let us see the tree for the small Decision Tree classifier
fig = plt.figure(figsize=(30,10))
plot_tree(dt_small, filled=True, feature_names=feature_names, class_names=class_names)
plt.suptitle("Small Decsion Tree", fontsize=30)
plt.show()
The plot of the small decision tree is much more interpretable than the large one.
First of all we find that only three features are involved in the small tree (marital-status_married, capital-gain, education_num) which makes it very human understandable. Those features are very well aligned with the correlations discussed earlier. Despite only three features being used the leafs color are quite intense meaning the impurity is fairly low for the target variable income. First the model checks whether the a person is married or not. If not, then most samples are classified <=50K. For those classified as >50K capital-gain is the most impactful feature.
Note that the features do no necessarily imply causal relations with income. Spurious correlations might be involved. For example we found the being married strongly correlated with being male in the dataset. So indirectly the model might base its decisions on gender which would be inresponsible in a credit score setting
There is no comprehensive way to plot the GBT classifier as it combines hundreds of trees making it hard to interprete decisions. However we can extract feature importances.
Summary: Using the actual tree structure to compare the models the small decision tree is definitely the most interpretable one. There are other technique to also make the other model more interpretable. Let's start with feature importance
Let us first look at the feature importance for the trees we have built previously. Feature importance can indicate which features have greater influence on the model's decision for predicting the target variable income. Note that we only show the top 10 most important features
# From https://medium.com/chinmaygaikwad/feature-importance-and-visualization-of-tree-models-d491e8198b0a
feat_dict= {}
for col, val in sorted(zip(X_train.columns, dt_large.feature_importances_),key=lambda x:x[1],reverse=True):
feat_dict[col]=val
feat_df = pd.DataFrame({'Feature':feat_dict.keys(),'Importance':feat_dict.values()})
feat_df.head(10)
| Feature | Importance | |
|---|---|---|
| 0 | marital-status_married | 0.422247 |
| 1 | education-num | 0.232743 |
| 2 | capital-gain | 0.193302 |
| 3 | capital-loss | 0.046936 |
| 4 | age | 0.037802 |
| 5 | occupation | 0.031206 |
| 6 | hours-per-week | 0.026543 |
| 7 | education_Bachelors | 0.004309 |
| 8 | relationship_Not-in-family | 0.002509 |
| 9 | education_HS-grad | 0.002078 |
Here we find that for the large decision tree the most important feature is maritial-status_married. education-num and capital-gain are almost equally important. The rest is much less important to the classifier (< 5%). That can also explain why the small tree with depth 3 performs almost equally well.
feat_dict= {}
for col, val in sorted(zip(X_train.columns, dt_small.feature_importances_),key=lambda x:x[1],reverse=True):
feat_dict[col]=val
feat_df = pd.DataFrame({'Feature':feat_dict.keys(),'Importance':feat_dict.values()})
feat_df.head(10)
| Feature | Importance | |
|---|---|---|
| 0 | marital-status_married | 0.521211 |
| 1 | education-num | 0.243471 |
| 2 | capital-gain | 0.235318 |
| 3 | age | 0.000000 |
| 4 | occupation | 0.000000 |
| 5 | sex | 0.000000 |
| 6 | capital-loss | 0.000000 |
| 7 | hours-per-week | 0.000000 |
| 8 | workclass_Goverment-sector | 0.000000 |
| 9 | workclass_Private | 0.000000 |
For the small tree the distribution is almost identical to the large tree. But as depicted in the plot only three features were involved in the decision making process which makes the model easy to understand.
feat_dict= {}
for col, val in sorted(zip(X_train.columns, black_box.feature_importances_),key=lambda x:x[1],reverse=True):
feat_dict[col]=val
feat_df = pd.DataFrame({'Feature':feat_dict.keys(),'Importance':feat_dict.values()})
feat_df.head(10)
| Feature | Importance | |
|---|---|---|
| 0 | marital-status_married | 0.356008 |
| 1 | capital-gain | 0.194160 |
| 2 | education-num | 0.193512 |
| 3 | capital-loss | 0.066804 |
| 4 | age | 0.061058 |
| 5 | hours-per-week | 0.044636 |
| 6 | occupation | 0.042130 |
| 7 | relationship_Wife | 0.008528 |
| 8 | relationship_Husband | 0.006597 |
| 9 | education_School | 0.004365 |
For the black box model we can still extract feature importances. We find that similar to the decision trees marital-status_married, capital-gain and education_num are the most important features.
Summary: We find that in terms of feature importance all models show similar results. The reason behind that is that all models are tree-based algorithms.
PDPs can show whether the relationship between the target and a feature is linear, monotonous or more complex [1, Chapter 8.1] . PDPs are model agnostic, so we can apply them to all model including the black box model. Let us focus on numeric variables to detect more complex relationships
Limitation: PDP assume features to be uncorrelated which is not the case for these features, see correlation plot. Note that for the small decision tree only three features were used.
# From Lab 2
def plot_pdp(model, model_name, df, feature, cluster_flag=False, nb_clusters=None, lines_flag=False, grid_resolution=100):
# Create the data that we will plot
pdp_goals = pdp.pdp_isolate(model=model, dataset=df, model_features=df.columns.tolist(), feature=feature)
# plot it
pdp.pdp_plot(pdp_goals, feature, cluster=cluster_flag, n_cluster_centers=nb_clusters, plot_lines=lines_flag,
plot_pts_dist=True, plot_params={'title': f'Partial Dependence Plot {model_name} on {feature}'})
plt.show()
education-num
plot_pdp(dt_large, 'Large DT', X_train, 'education-num')
plot_pdp(dt_small, 'Small DT', X_train, 'education-num')
plot_pdp(black_box, 'GBT', X_train, 'education-num')
We find that for the number of year in education (education-num) there is a positive linear relation with income >50K in all models. Furthermore as most people have 7 years of education the effect on income is zero. The same observation can be made for all models
capital-gain
plot_pdp(dt_large, 'Small DT', X_train, 'capital-gain')
plot_pdp(dt_small, 'Large DT', X_train, 'capital-gain')
plot_pdp(black_box, 'GBT', X_train, 'capital-gain')
As for capital-gain both Decision trees show a positive linear relationship with income which is reasonable. The GBT classifier shows a very slight negative relationship. However that might be misleading as mentioned in the Interpretable ML book:
"Heterogeneous effects might be hidden. The PD curve could be a horizontal line, since the effects of both halves of the dataset could cancel each other out. You then conclude that the feature has no effect on the prediction" [1, Chapter 8.1.4]
One way to mitigate this problem is to plot Individual Conditional Expectation (ICE) curves which I leave for future work.
hours-per-week
plot_pdp(dt_large, 'Large DT', X_train, 'hours-per-week')
plot_pdp(black_box, 'GBT', X_train, 'hours-per-week')
feature_hist = 'hours-per-week'
plt.figure(figsize=(8,6))
sns.histplot(X_train[feature_hist])
plt.title(f'Distribution of {feature_hist}')
plt.show()
For 25 and 45 hours roughly we find a positve linear relationship between hours worked per week and income for both the large decision tree and GBT. From 50h onwards there seems to be no increase in high income. However as there are less data points this observation is unreliable. Note that the small decision tree does not use the ´hours-per-week´ feature which means the PDP would be constant at zero.
age
plot_pdp(dt_large, 'Large DT', X_train, 'age')
plot_pdp(black_box, 'GBT', X_train, 'age')
feature_hist = 'age'
plt.figure(figsize=(8,6))
sns.histplot(X_train[feature_hist])
plt.title(f'Distribution of {feature_hist}')
plt.show()
As for age we find a postive relationship to income from an age of 25. This trend flattens the higher the age for both the large decision tree and GBT. For GBT the there is even a slight negative relationship to income from 50 onwards. But again, the amount of data for older people is low and therefore this observation is not reliable.
SHAP (SHapley Additive exPlanations) is a method to explain individual predictions [1, Chapter 9.6] . SHAP is based on the game theoretic concept of Shapley Values. SHAP can help explain predicitons on a local level meaning individal samples. Let us randomly pick instance 4 in the test data and analyse it using SHAP. I will apply SHAP to the black box model GBT as it is the hardest to interpret and SHAP can give valuable insights
instance = df_test.iloc[[3]]
instance
| age | workclass | education | education-num | marital-status | occupation | relationship | race | sex | capital-gain | capital-loss | hours-per-week | native-country | y | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 4 | 44 | Private | Some-college | 10 | married | 1 | Husband | Black | 0 | 7688 | 0 | 40 | usa | 1 |
Some key characteristics of instance 4:
explainer_dt_small = shap.TreeExplainer(dt_small)
explainer_dt_large = shap.TreeExplainer(dt_large)
explainer_black_box = shap.TreeExplainer(black_box)
shap_values_dt_small = explainer_dt_small.shap_values(X_test)
shap_values_dt_large = explainer_dt_large.shap_values(X_test)
shap_values_black_box = explainer_black_box.shap_values(X_test)
shap.initjs()
shap.force_plot(explainer_black_box.expected_value, shap_values_black_box[3,:], X_test.iloc[3,:], matplotlib=True)
We find that the high value of capital-gain (7688) has the highest impact on correctly predicting high income (>50K). Appart from that being married and aged 44 also have positive impact on the prediction. Being female (sex=0), however, led to slight tendency to predict lower income. The same analysis can be made to any other instance in the dataset. Let us look at more global approaches using SHAP.
shap.initjs()
shap.force_plot(explainer_black_box.expected_value, shap_values_black_box[:1000,:], X_test.iloc[:1000,:])
This is an interactive plot using SHAP on the GBT classifier. It groups samples by similarity. Hovering over the plot with the mouse we can identify common characteristics.
Red group on the left: People predicted as high income commonly have high capital-gain (>7000) and are mostly married.
Red peak in the middle: People share high number of years in education (> 13) and being married.
Blue valley in the middle: Samples are predicted as low income. They share low age (< 25) and not being married, and low number of hours per week (< 30).
This plot is extremely helpful to identify common charateristcs and to explain a models prediction. Note that it is computationally very expensive. Thus we only plot 1000 samples from the test set. (Full set would take multiple minutes to compute)
shap.summary_plot(shap_values_black_box, X_test)
In this summary plot we find the especially being married, longer education as well as higher age lead to high income predicitons for the GBT classifier. High capital-gain also tend to result in higher income predictions. Again these relations are not causal and particularily facts on marriage most probably hide confounder variables (e.g. gender, finaical stability, having children, etc.)
LIME standw for Local interpretable model-agnostic explanations. It uses surrogate models
"Surrogate models are trained to approximate the predictions of the underlying black box model. Instead of training a global surrogate model, LIME focuses on training local surrogate models to explain individual predictions" [1, Chapter 9.2]
Let us perform LIME on the GBT again. The same would apply to the Decision Trees. Again we try to explain instance number 4
explainer = LimeTabularExplainer(np.array(X_train), feature_names=list(X_train.columns), class_names=['<=50K', '>50K'], mode='classification')
instance = X_test.iloc[3]
exp = explainer.explain_instance(instance, black_box.predict_proba)
exp.show_in_notebook(show_table=True)
Here we can see that GBT predicts with 98% certainty that sample number 4 has >50K income. Addiontally we can see the most important featues that contributed to that predition, namely high capital-gain at most. Additionally we can find the the feature values for this sample on the right table and whether a feature had positve impact on the income prediction (orange) or negative impact (blue). The same anaysis could be made for the other claassifiers and various instances
Let's use LIME on the same instance using the Decision Trees
explainer = LimeTabularExplainer(np.array(X_train), feature_names=list(X_train.columns), class_names=['<=50K', '>50K'], mode='classification')
instance = X_test.iloc[3]
exp = explainer.explain_instance(instance, dt_small.predict_proba)
exp.show_in_notebook(show_table=True)
explainer = LimeTabularExplainer(np.array(X_train), feature_names=list(X_train.columns), class_names=['<=50K', '>50K'], mode='classification')
instance = X_test.iloc[3]
exp = explainer.explain_instance(instance, dt_large.predict_proba)
exp.show_in_notebook(show_table=True)
We find similar explanations for the decision trees. However the fact that this instance is married had higher positive impact on the prediction for the decision trees than for GBT
First we identify senitive features:
sexraceAdditionally we can study the intersection of sex and race.
sex_train = df_train_pre_encoding['sex']
sex_test = df_test_pre_encoding['sex']
race_train = df_train_pre_encoding['race']
race_test = df_test_pre_encoding['race']
intersection_train = df_train_pre_encoding[['sex', 'race']]
intersection_test = df_test_pre_encoding[['sex', 'race']]
As a fairness metrics I choose selection_rate from the fairlearn framework. It calculates the fraction of predicted labels matching the ‘good’ outcome, which is >50K income in this case. We can also think of the good outcome as a high credit score since this data could be used for that as well. We will analyse the Large Decision Tree first, then the small Decision Tree and then the Black box model (GBT)
Large Decision Tree
# ideas from https://fairlearn.org/main/auto_examples/plot_grid_search_census.html
def plot_fairness(model, model_name, sensitive_features_test, sensitive_features_names, print_report=False):
# Defining the metrics
metric_frame = MetricFrame(
metrics={
"selection_rate": selection_rate,
"balanced_accuracy": balanced_accuracy_score,
"count": count,
},
sensitive_features=sensitive_features_test,
y_true=y_test,
y_pred=model.predict(X_test),
)
if print_report:
print(metric_frame.by_group)
metric_frame.by_group.plot.bar(
subplots=True,
layout=[3, 1],
legend=False,
figsize=[12, 8],
title=f"{model_name} grouped by {sensitive_features_names}",
)
plt.show()
# we need a differnt plot method for the fair model because we use a ThresholdOptimizer for postprocessing a model
# it requires the sensitive features for predicting
def plot_fairness_fair_model(model, model_name, sensitive_features_test, sensitive_features_names, print_report=False, no_count=False):
# Doesnt show the count plot
if no_count:
metric_frame = MetricFrame(
metrics={
"selection_rate": selection_rate,
"balanced_accuracy": balanced_accuracy_score,
},
sensitive_features=sensitive_features_test,
y_true=y_test,
y_pred=model.predict(X_test, sensitive_features=sensitive_features_test),
)
else:
metric_frame = MetricFrame(
metrics={
"selection_rate": selection_rate,
"balanced_accuracy": balanced_accuracy_score,
"count": count,
},
sensitive_features=sensitive_features_test,
y_true=y_test,
y_pred=model.predict(X_test, sensitive_features=sensitive_features_test),
)
#prints concrete numbers
if print_report:
print(metric_frame.by_group)
#plotting the results
metric_frame.by_group.plot.bar(
subplots=True,
layout=[3, 1],
legend=False,
figsize=[12, 8],
title=f"{model_name} grouped by {sensitive_features_names}",
)
plt.show()
plot_fairness(model=dt_large,
model_name="Large DT",
sensitive_features_test=sex_test,
sensitive_features_names="sex",
print_report=True,
)
selection_rate balanced_accuracy count sex Female 0.059410 0.705362 5420.0 Male 0.227616 0.750474 10856.0
plot_fairness(model=dt_large,
model_name="Large DT",
sensitive_features_test=race_test,
sensitive_features_names="race",
print_report=True
)
selection_rate balanced_accuracy count race Black 0.076874 0.721612 1561.0 Other 0.192506 0.767084 774.0 White 0.181049 0.750570 13941.0
plot_fairness(model=dt_large,
model_name="Large DT",
sensitive_features_test=intersection_test,
sensitive_features_names="race and sex",
print_report=True
)
selection_rate balanced_accuracy count
sex race
Female Black 0.033201 0.658931 753.0
Other 0.081272 0.754666 283.0
White 0.062500 0.705929 4384.0
Male Black 0.117574 0.736846 808.0
Other 0.256619 0.758028 491.0
White 0.235430 0.749687 9557.0
Large Decision Tree Summary:
Looking at the selection_rate (blue) men are almost four times as likely to be predicted high income compared to women. Additionally White people are more than twice as likely to be predicted high income compared to other race.
As for intersectionality white men have a selection rate twice as high as all other combinations of features (execpt from (Male,Other))
Assuming the model would be deployed for credit score prediction this disparity would not be acceptable from a fariness point of view. The balanced accuracy is pretty fair accross both features.
Small Decision Tree
plot_fairness(model=dt_small,
model_name="Small DT",
sensitive_features_test=sex_test,
sensitive_features_names="sex",
print_report=True
)
selection_rate balanced_accuracy count sex Female 0.064760 0.710919 5420.0 Male 0.206245 0.718796 10856.0
plot_fairness(model=dt_small,
model_name="Small DT",
sensitive_features_test=race_test,
sensitive_features_names="race",
print_report=True
)
selection_rate balanced_accuracy count race Black 0.071749 0.705576 1561.0 Other 0.196382 0.735272 774.0 White 0.166846 0.723541 13941.0
plot_fairness(model=dt_small,
model_name="Small DT",
sensitive_features_test=intersection_test,
sensitive_features_names="race and sex",
print_report=True
)
selection_rate balanced_accuracy count
sex race
Female Black 0.033201 0.684147 753.0
Other 0.098940 0.694484 283.0
White 0.067974 0.713848 4384.0
Male Black 0.107673 0.707648 808.0
Other 0.252546 0.736235 491.0
White 0.212200 0.717512 9557.0
Small Decision Tree Summary:
Again we identify a strong bias against women and black black people in the selection_rate. As compared to the large decision tree the diffenrence in selection_rate is slitghly less but still significant. Both models perform similarily unfair on sex and race intersectionality.
GBT
plot_fairness(model=black_box,
model_name="GBT",
sensitive_features_test=sex_test,
sensitive_features_names="sex",
print_report=True
)
selection_rate balanced_accuracy count sex Female 0.083764 0.767776 5420.0 Male 0.161385 0.722758 10856.0
plot_fairness(model=black_box,
model_name="GBT",
sensitive_features_test=race_test,
sensitive_features_names="race",
print_report=True
)
selection_rate balanced_accuracy count race Black 0.062140 0.711003 1561.0 Other 0.140827 0.727337 774.0 White 0.143462 0.730538 13941.0
plot_fairness(model=black_box,
model_name="GBT",
sensitive_features_test=intersection_test,
sensitive_features_names="race and sex",
print_report=True
)
selection_rate balanced_accuracy count
sex race
Female Black 0.035857 0.670133 753.0
Other 0.095406 0.746634 283.0
White 0.091241 0.776705 4384.0
Male Black 0.086634 0.724711 808.0
Other 0.167006 0.722581 491.0
White 0.167417 0.722216 9557.0
GBT Summary:
Also the black box model is biased towards men when it comes to income prediction. Male are about twice as likely to be predicted high income than female. Compared to the Decision Trees that is much more fair but still inacceptable. On intersectionality the black box model is also slightly more fair than the Decision Trees.
We conclude that the Large Decision Tree is the least fair model.
The disparity in selection for race and sex and their intersection are higher than the other two model.
print('Least Fair Model: ', str(dt_large))
print('Balanced Accuracy: ', round(balanced_accuracy_score(dt_large.predict(X_test),y_test),3))
print('Difference in Selection Rate on intersection: ', round(selection_rate_difference(y_test, dt_large.predict(X_test), sensitive_features=intersection_test),3))
Least Fair Model: DecisionTreeClassifier(max_depth=10, min_samples_split=1000, random_state=1) Balanced Accuracy: 0.819 Difference in Selection Rate on intersection: 0.223
pd_least_fair = pd.DataFrame.from_dict([{'Model': dt_large,
'Balanced Accuracy': balanced_accuracy_score(dt_large.predict(X_test),y_test),
'Difference in Selection Rate': selection_rate_difference(y_test, dt_large.predict(X_test), sensitive_features=intersection_test)
}])
# Printing the full row
pd.set_option('max_colwidth', 100)
pd_least_fair
| Model | Balanced Accuracy | Difference in Selection Rate | |
|---|---|---|---|
| 0 | DecisionTreeClassifier(max_depth=10, min_samples_split=1000, random_state=1) | 0.818539 | 0.223419 |
One interesting observation is that even though sex and race were almost not involved in the prediction (e.g. see feature importance section) all models are strongly biased against those sensitive features. This shows that simply droping sensitive features from the data does not help making models more fair. The reason is that the other features are still correlated with those sensitive features. (e.g. marital-status and sex).
Now we use the ThresholdOptimizer (documentation) from the fairlearn framework to find a model that mitigates the fairness issues discovered previously.
ThresholdOptimizer is a postprocessing technique that takes a fitted model and a set of sensitive features as input This way the model's preditions are adjusted to fulfill specified parity constraints. Since we use selection_rate as a fairness metric we choose selection_rate_parity as a constraint. Therefore the model is optimized on equal opportunities independent of a sensitive feature.
We postprocess all three existing models given the intersection of race and sex and then conclude which model is the most fair one.
# Initializing the ThresholdOptimizer object.
dt_large_fair = ThresholdOptimizer(estimator=dt_large, constraints="selection_rate_parity",objective='balanced_accuracy_score')
# Fitting the new model given the sensitive features
dt_large_fair.fit(X_train, y_train, sensitive_features=intersection_train)
# Predicting giving the sensitive features
y_pred_dt_large_fair = dt_large_fair.predict(X_test, sensitive_features=intersection_test)
# Initializing the ThresholdOptimizer object.
dt_small_fair = ThresholdOptimizer(estimator=dt_small, constraints="selection_rate_parity",objective='balanced_accuracy_score')
# Fitting the new model given the sensitive features
dt_small_fair.fit(X_train, y_train, sensitive_features=intersection_train)
# Predicting giving the sensitive features
y_pred_dt_small_fair = dt_small_fair.predict(X_test, sensitive_features=intersection_test)
# Initializing the ThresholdOptimizer object.
black_box_fair = ThresholdOptimizer(estimator=black_box, constraints="selection_rate_parity",objective='balanced_accuracy_score')
# Fitting the new model given the sensitive features
black_box_fair.fit(X_train, y_train, sensitive_features=intersection_train)
# Predicting giving the sensitive features
y_pred_black_box_fair = black_box_fair.predict(X_test, sensitive_features=intersection_test)
plot_fairness_fair_model(model=dt_large_fair,
model_name="Large DT Fair Model",
sensitive_features_test=intersection_test,
sensitive_features_names="race and sex",
print_report=True,
no_count=True,
)
selection_rate balanced_accuracy
sex race
Female Black 0.237716 0.588457
Other 0.268551 0.681668
White 0.241560 0.667320
Male Black 0.220297 0.692577
Other 0.256619 0.758028
White 0.235743 0.749703
plot_fairness_fair_model(model=dt_small_fair,
model_name="Small DT Fair Model",
sensitive_features_test=intersection_test,
sensitive_features_names="race and sex",
print_report=True,
no_count=True,
)
selection_rate balanced_accuracy
sex race
Female Black 0.247012 0.621358
Other 0.233216 0.651606
White 0.217153 0.680042
Male Black 0.216584 0.677233
Other 0.211813 0.690971
White 0.212828 0.717300
plot_fairness_fair_model(model=black_box_fair,
model_name="GBT Fair Model",
sensitive_features_test=intersection_test,
sensitive_features_names="race and sex",
print_report=True,
no_count=True,
)
selection_rate balanced_accuracy
sex race
Female Black 0.253652 0.643058
Other 0.289753 0.736475
White 0.261405 0.713381
Male Black 0.215347 0.704347
Other 0.189409 0.706776
White 0.166998 0.721788
Comparision of Fair Models:
The mitigated models all perform more fair in terms of selection_rate than the unmitigated ones. However there is a significatn decrease in accuracy. Interestingly the GBT is now slightly biased against men. Both decision trees perform very fair on the intersection of race and sex.
classifiers = [dt_large, dt_small, black_box]
classifiers_fair = [dt_large_fair, dt_small_fair, black_box_fair]
classifiers_names = ['Large DT', 'Small DT', 'GBT', 'Fair Large DT', 'Fair Small DT', 'Fair GBT']
# saving the performances
accuracies = []
# saving selection rate parities
parities = []
for clf in classifiers:
y_pred = clf.predict(X_test)
acc = balanced_accuracy_score(y_pred, y_test)
parity = selection_rate_difference(y_test, y_pred, sensitive_features=intersection_test)
parities.append(parity)
accuracies.append(acc)
for clf in classifiers_fair:
y_pred = clf.predict(X_test, sensitive_features=intersection_test)
acc = balanced_accuracy_score(y_pred, y_test)
parity = selection_rate_difference(y_test, y_pred, sensitive_features=intersection_test)
parities.append(parity)
accuracies.append(acc)
fig, ax = plt.subplots(figsize=(8, 6))
ax.scatter(accuracies, parities)
#Annotating the scatter points
ax.annotate(classifiers_names[0], (accuracies[0], parities[0]), xytext=(accuracies[0]+0.05, parities[0]+0.3),
arrowprops = dict(arrowstyle="->",
connectionstyle="angle3,angleA=0,angleB=-90"))
ax.annotate(classifiers_names[1], (accuracies[1], parities[1]), xytext=(accuracies[1]-0.3, parities[1]+0.1),
arrowprops = dict(arrowstyle="->",
connectionstyle="angle3,angleA=0,angleB=-90"))
ax.annotate(classifiers_names[2], (accuracies[2], parities[2]), xytext=(accuracies[2]+0.05, parities[2]+0.3),
arrowprops = dict(arrowstyle="->",
connectionstyle="angle3,angleA=0,angleB=-90"))
ax.annotate(classifiers_names[3], (accuracies[3], parities[3]), xytext=(accuracies[3]-0.3, parities[3]+0.1),
arrowprops = dict(arrowstyle="->",
connectionstyle="angle3,angleA=0,angleB=-90"))
ax.annotate(classifiers_names[4], (accuracies[4], parities[4]), xytext=(accuracies[4]-0.3, parities[4]+0.05),
arrowprops = dict(arrowstyle="->",
connectionstyle="angle3,angleA=0,angleB=-90"))
ax.annotate(classifiers_names[5], (accuracies[5], parities[5]), xytext=(accuracies[5]-0.3, parities[5]+0.1),
arrowprops = dict(arrowstyle="->",
connectionstyle="angle3,angleA=0,angleB=-90"))
plt.ylim(0,1)
plt.xlim(0,1)
plt.xlabel('Balanced Accuracy')
plt.ylabel('Selection Rate Difference')
plt.title('Tradeoff Fairness vs Accuracy')
plt.show()
We find that there is a trade off between accuracy and fairness. All mitigated models perform worse on the test set in terms of balanced accurcy compared to the unmitageted models. On the other hand all mitigated models have a lower difference in selection rate and are therefore more fair than others. Looking at pareto optimality all models are dominted by Fair Small DT and GBT.
From the plot above we find that the Fair Small Decision Tree is the most fair model. It is more fair at the cost of accuracy. For a credit score system, however, fairness might be more crucial than accuracy.
pd_most_fair = pd.DataFrame.from_dict([{'Model': dt_small_fair,
'Balanced Accuracy': balanced_accuracy_score(dt_small_fair.predict(X_test, sensitive_features=intersection_test),y_test),
'Difference in Selection Rate': selection_rate_difference(y_test, dt_small_fair.predict(X_test, sensitive_features=intersection_test), sensitive_features=intersection_test)
}])
# Printing the full row
pd.set_option('max_colwidth', 300)
pd_most_fair
| Model | Balanced Accuracy | Difference in Selection Rate | |
|---|---|---|---|
| 0 | ThresholdOptimizer(constraints='selection_rate_parity',\n estimator=DecisionTreeClassifier(max_depth=3,\n random_state=1),\n objective='balanced_accuracy_score') | 0.710012 | 0.031483 |
Comparing the least and the most fair model side by side:
pd_overview = pd.concat([pd_least_fair, pd_most_fair])
pd_overview
| Model | Balanced Accuracy | Difference in Selection Rate | |
|---|---|---|---|
| 0 | DecisionTreeClassifier(max_depth=10, min_samples_split=1000, random_state=1) | 0.818539 | 0.223419 |
| 0 | ThresholdOptimizer(constraints='selection_rate_parity',\n estimator=DecisionTreeClassifier(max_depth=3,\n random_state=1),\n objective='balanced_accuracy_score') | 0.710012 | 0.031483 |
As for all Machine Learning Models they can only be as good as the data they are trained on. So most probably the unmitigated models take over biases that already exist in the data. In this section we will analyse the data for biases and fairness. Apart from race and sex we will also include native-country in this analysis.
First of all we try to identify any sampling biases in the dataset
# defining sensitive features. Let us also include native-country
sensitives = ['sex', 'race', 'native-country']
#Analysing the full data
full_data = pd.concat([df_train_pre_encoding, df_test_pre_encoding])
plt.figure(figsize=(15,10))
for i,sensitive in enumerate(sensitives):
# We are using the string categories for better readability
counts = full_data[sensitive].value_counts()
labels = counts.index
plt.subplot(1,len(sensitives),i+1)
plt.pie(counts, startangle=90, autopct='%1.1f%%')
plt.legend(labels, loc=2,fontsize=8)
plt.title(f"Distribution of {sensitive}")
plt.tight_layout()
plt.plot()
Key observations:
Accoring to the survey on Bias and Fairness [2] we have found Sampling Bias in the data
Now we can analyse the prevalence for sensitve features. Assuming fair data the proportion in the population for high vs. low income should be the same regardless of race, sex or native country. We can test that by comparing prevalencess for different features
# Similar to Lab 1
df_fair = full_data[['race','sex', 'native-country']]
#Define protected features
df_fair['priv_race'] = [1 if r=='White' else 0 for r in df_fair['race']]
df_fair['priv_sex'] = [1 if s=='Male' else 0 for s in df_fair['sex']]
df_fair['priv_native_country'] = [1 if s=='usa' else 0 for s in df_fair['native-country']]
#Define target variable
df_fair['y'] = [1 if y == '>50K'else 0 for y in full_data['y']]
#Saving results in prevalences and plotting them later
prevalences = []
#Calculate the prevelance of target variable of y
prev = df_fair['y'].mean()
prevalences.append(prev)
####
#Calculate the prevelance of priv_race with respect to y
prev_race = df_fair.groupby('priv_race')['y'].mean().values
prevalences.append(prev_race[0])
prevalences.append(prev_race[1])
####
#Calculate the prevelance of priv_Sex with respect to y
prev_sex = df_fair.groupby('priv_sex')['y'].mean().to_numpy()
prevalences.append(prev_sex[0])
prevalences.append(prev_sex[1])
####
#Calculate the prevelance of priv_native_country with respect to y
prev_native_country = df_fair.groupby('priv_native_country')['y'].mean().to_numpy()
prevalences.append(prev_native_country[0])
prevalences.append(prev_native_country[1])
####
#Calculate the combined prevelance of priv_race, priv_sex with and priv_native_country respect to y
prev_comb = df_fair.groupby(['priv_race','priv_sex', 'priv_native_country'])['y'].mean()
#normalizing prev_comb
prev_comb_vector = prev_comb.to_numpy()
prev_comb_vector_normalized = prev_comb_vector / sum(prev_comb_vector)
fig = plt.figure(figsize=(8,6))
ax = fig.add_axes([0,0,1,1])
prevalences
categories = ['Overall', 'non-white', 'white', 'female', 'male', 'foreign born', 'US born']
ax.bar(categories,prevalences)
plt.xlabel('Feature Condition')
plt.ylabel('Prevalence')
plt.title('Prevelence conditioned on sensitve data')
plt.show()
We can see that for the white and male population the prevalence (i.e. the proportion of income) for >50K income is higher than for the overall population. Furthermore the proportion for higher income for back, female and foreign born population is significantly lower than the overall population. Consequently the data is biased with respect to prevalence
labels = ['non-white, female, foreign born',
'non-white, female, US born',
'non-white, male, foreign born',
'non-white, male, US born',
'white, female, foreign born',
'white, female, US born',
'white, male, foreign born',
'white, male, US born',
]
plt.figure(figsize=(15,10))
plt.pie(prev_comb_vector_normalized, startangle=90, autopct='%1.1f%%')
plt.legend(labels, loc=2,fontsize=12)
plt.title('Demographic distribution for high income (>50K) samples')
plt.tight_layout()
plt.plot()
[]
Here we can see the distribution of the intersection of sensitve demographics conditioned on the high income (>50K) class. We find that almost one in four of those with high income are white, male and US born. Also, all female categories are underrepresented in the high income class (< 10% each)
Accoring to the survey on Bias and Fairness [2] we have found Historical Bias in the data.
It is really hard to mitigate historical biases. For example the fact that women are paid less is a complex economic and political problem which is hard to fix. As for the sampling biases, however, there are many techniques to ensure that there is no under or overrepresentation in the data (e.g. the imbalance in gender could easily be mitigated using the cluster sampling technique)
For the counterfactual analysis I am using the Dice_ml framework. It allows us to change certain features such that the models output changes from low income to high income or vice versa. This way we can find out how an individual would need to change in order to change the model's prediction. As actionalbe feature we specify:
education-numhours-per-weekWe restrict the range of the features to this maximun and minimum occurance in the whole dataset. This way we ensure no unrealistic counterfactuals.
Unfortunately the ml_dice framework only supports models built from common libraries such as sklearn or tensorflow. The fair model we have built is postprocessed using fairlearn specic package and is therefore not supported in the dice_ml framework. We can still do some manual analysis.
# Recombining X_train and y_train to create the Dice data
data = pd.concat([X_train, y_train], axis=1)
# Dataset for training an ML model using the dice_ml.Data object
dice_data = Data(dataframe=data,
continuous_features=['age',
'education-num',
'occupation',
'capital-gain',
'capital-loss',
'hours-per-week',
'sex'
],
outcome_name='y')
# creating a dice_ml.Model object for the least fair model
dice_dt_large = Model(model=dt_large,
backend='sklearn')
# creating a dice_ml.Dice object (explainer) for the least fair model
exp_dt_large = Dice(dice_data,dice_dt_large, method='random')
# here we can retrive a sample and change some features if we want to
def get_instance(sample_id, education_num=None, hours_per_week=None, occupation=None, sex=None, race=None):
#making sure to not change the original data
sample = X_test.iloc[[sample_id]].copy()
if education_num is not None:
sample['education-num']=education_num
if hours_per_week is not None:
sample['hours-per-week']=hours_per_week
if occupation is not None:
sample['occupation']=occupation
if sex is not None:
sample['sex']=sex
if race is not None:
# Race can be 'Black' or 'White'
if race=='Black':
sample['race_White'] = 0
sample['race_Other'] = 0
sample['race_Black'] = 1
elif race=='White':
sample['race_White'] = 1
sample['race_Other'] = 0
sample['race_Black'] = 0
else:
print('Choose Black or White')
return sample
# See all columns when plotting a dataframe
pd.set_option('display.max_columns', None)
# this method will visualse some counterfactuals in order to change the prediction of a specific sample
def visualize_cf(explainer, sample_id, features_to_vary, total_CFs, education_num=None, hours_per_week=None, occupation=None, sex=None, race=None):
# Check a specific sample
query_instance = get_instance(sample_id, education_num=education_num, hours_per_week=hours_per_week, occupation=occupation, sex=sex, race=race)
minmax = []
for feature in features_to_vary:
minmax.append([min(data[feature]),max(data[feature])])
permitted_range=dict(zip(features_to_vary,minmax))
print(permitted_range)
# Now generating explanations using the predefined features.
cf = explainer.generate_counterfactuals(query_instance,
total_CFs=total_CFs,
desired_class="opposite",
permitted_range=permitted_range,
features_to_vary=features_to_vary)
# Visualize it
cf.visualize_as_dataframe(show_only_changes=True)
Example 1
# feasible features to vary:
features_to_vary=['education-num',
#'sex',
#'hours-per-week',
#'occupation'
]
visualize_cf(explainer=exp_dt_large,
sample_id=13,
features_to_vary=features_to_vary,
total_CFs=5,
education_num=None,
hours_per_week=None,
occupation=None,
sex=None,
race=None,
)
{'education-num': [1, 16]}
100%|██████████| 1/1 [00:00<00:00, 6.33it/s]
Query instance (original outcome : 0)
| age | education-num | occupation | sex | capital-gain | capital-loss | hours-per-week | workclass_Goverment-sector | workclass_Private | workclass_Self-employed | workclass_Unemployed | education_Assoc-acdm | education_Assoc-voc | education_Bachelors | education_Doctorate | education_HS-grad | education_Masters | education_Prof-school | education_School | education_Some-college | marital-status_Never-married | marital-status_married | marital-status_not-married | relationship_Husband | relationship_Not-in-family | relationship_Other-relative | relationship_Own-child | relationship_Unmarried | relationship_Wife | race_Black | race_Other | race_White | native-country_asia | native-country_central-america | native-country_europe | native-country_other | native-country_south-america | native-country_usa | y | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 58 | 9 | 4 | 0 | 0 | 0 | 35 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
Diverse Counterfactual set (new outcome: 1.0)
| age | education-num | occupation | sex | capital-gain | capital-loss | hours-per-week | workclass_Goverment-sector | workclass_Private | workclass_Self-employed | workclass_Unemployed | education_Assoc-acdm | education_Assoc-voc | education_Bachelors | education_Doctorate | education_HS-grad | education_Masters | education_Prof-school | education_School | education_Some-college | marital-status_Never-married | marital-status_married | marital-status_not-married | relationship_Husband | relationship_Not-in-family | relationship_Other-relative | relationship_Own-child | relationship_Unmarried | relationship_Wife | race_Black | race_Other | race_White | native-country_asia | native-country_central-america | native-country_europe | native-country_other | native-country_south-america | native-country_usa | y | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | - | 11 | - | - | - | - | - | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 1 |
| 1 | - | 15 | - | - | - | - | - | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 1 |
| 2 | - | 14 | - | - | - | - | - | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 1 |
| 3 | - | 16 | - | - | - | - | - | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 1 |
| 4 | - | 10 | - | - | - | - | - | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 1 |
Example 2
# feasible features to vary:
features_to_vary=[#'education-num',
#'sex',
'hours-per-week',
#'occupation'
]
visualize_cf(explainer=exp_dt_large,
sample_id=100,
features_to_vary=features_to_vary,
total_CFs=5,
education_num=None,
hours_per_week=None,
occupation=None,
sex=None,
race=None,
)
{'hours-per-week': [1, 99]}
100%|██████████| 1/1 [00:00<00:00, 4.98it/s]
Query instance (original outcome : 0)
| age | education-num | occupation | sex | capital-gain | capital-loss | hours-per-week | workclass_Goverment-sector | workclass_Private | workclass_Self-employed | workclass_Unemployed | education_Assoc-acdm | education_Assoc-voc | education_Bachelors | education_Doctorate | education_HS-grad | education_Masters | education_Prof-school | education_School | education_Some-college | marital-status_Never-married | marital-status_married | marital-status_not-married | relationship_Husband | relationship_Not-in-family | relationship_Other-relative | relationship_Own-child | relationship_Unmarried | relationship_Wife | race_Black | race_Other | race_White | native-country_asia | native-country_central-america | native-country_europe | native-country_other | native-country_south-america | native-country_usa | y | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 51 | 10 | 4 | 1 | 0 | 0 | 18 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
Diverse Counterfactual set (new outcome: 1.0)
| age | education-num | occupation | sex | capital-gain | capital-loss | hours-per-week | workclass_Goverment-sector | workclass_Private | workclass_Self-employed | workclass_Unemployed | education_Assoc-acdm | education_Assoc-voc | education_Bachelors | education_Doctorate | education_HS-grad | education_Masters | education_Prof-school | education_School | education_Some-college | marital-status_Never-married | marital-status_married | marital-status_not-married | relationship_Husband | relationship_Not-in-family | relationship_Other-relative | relationship_Own-child | relationship_Unmarried | relationship_Wife | race_Black | race_Other | race_White | native-country_asia | native-country_central-america | native-country_europe | native-country_other | native-country_south-america | native-country_usa | y | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | - | - | - | - | - | - | 96 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 1 |
| 1 | - | - | - | - | - | - | 61 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 1 |
| 2 | - | - | - | - | - | - | 59 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 1 |
| 3 | - | - | - | - | - | - | 75 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 1 |
| 4 | - | - | - | - | - | - | 76 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 1 |
Analysis
Here we can see two examples where the model predicted low income. However, increasing the number of years in education while holding the other features constant would change the model's prediction to high income (>50K). Similarily the second example shows that for some people increasing the number of hours worked per week can change the models prediction to high income. If this was a real credit score application, one could suggest to invest in further education or increase the working hours in order to have better chances for a high credit score.
Example 3
# feasible features to vary:
features_to_vary=[#'education-num',
#'sex',
'hours-per-week',
#'occupation'
]
visualize_cf(explainer=exp_dt_large,
sample_id=7,
features_to_vary=features_to_vary,
total_CFs=5,
education_num=None,
hours_per_week=None,
occupation=None,
sex=None,
race=None,
)
{'hours-per-week': [1, 99]}
100%|██████████| 1/1 [00:00<00:00, 6.18it/s]
Query instance (original outcome : 1)
| age | education-num | occupation | sex | capital-gain | capital-loss | hours-per-week | workclass_Goverment-sector | workclass_Private | workclass_Self-employed | workclass_Unemployed | education_Assoc-acdm | education_Assoc-voc | education_Bachelors | education_Doctorate | education_HS-grad | education_Masters | education_Prof-school | education_School | education_Some-college | marital-status_Never-married | marital-status_married | marital-status_not-married | relationship_Husband | relationship_Not-in-family | relationship_Other-relative | relationship_Own-child | relationship_Unmarried | relationship_Wife | race_Black | race_Other | race_White | native-country_asia | native-country_central-america | native-country_europe | native-country_other | native-country_south-america | native-country_usa | y | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 63 | 15 | 4 | 0 | 3103 | 0 | 32 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 1 |
Diverse Counterfactual set (new outcome: 0.0)
| age | education-num | occupation | sex | capital-gain | capital-loss | hours-per-week | workclass_Goverment-sector | workclass_Private | workclass_Self-employed | workclass_Unemployed | education_Assoc-acdm | education_Assoc-voc | education_Bachelors | education_Doctorate | education_HS-grad | education_Masters | education_Prof-school | education_School | education_Some-college | marital-status_Never-married | marital-status_married | marital-status_not-married | relationship_Husband | relationship_Not-in-family | relationship_Other-relative | relationship_Own-child | relationship_Unmarried | relationship_Wife | race_Black | race_Other | race_White | native-country_asia | native-country_central-america | native-country_europe | native-country_other | native-country_south-america | native-country_usa | y | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | - | - | - | - | - | - | 12 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
| 1 | - | - | - | - | - | - | 31 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
| 2 | - | - | - | - | - | - | 13 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
| 3 | - | - | - | - | - | - | 24 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
| 4 | - | - | - | - | - | - | 16 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
Example 4
# feasible features to vary:
features_to_vary=['education-num',
#'sex',
#'hours-per-week',
#'occupation'
]
visualize_cf(explainer=exp_dt_large,
sample_id=7,
features_to_vary=features_to_vary,
total_CFs=5,
education_num=None,
hours_per_week=None,
occupation=None,
sex=None,
race=None,
)
{'education-num': [1, 16]}
100%|██████████| 1/1 [00:00<00:00, 5.36it/s]
Query instance (original outcome : 1)
| age | education-num | occupation | sex | capital-gain | capital-loss | hours-per-week | workclass_Goverment-sector | workclass_Private | workclass_Self-employed | workclass_Unemployed | education_Assoc-acdm | education_Assoc-voc | education_Bachelors | education_Doctorate | education_HS-grad | education_Masters | education_Prof-school | education_School | education_Some-college | marital-status_Never-married | marital-status_married | marital-status_not-married | relationship_Husband | relationship_Not-in-family | relationship_Other-relative | relationship_Own-child | relationship_Unmarried | relationship_Wife | race_Black | race_Other | race_White | native-country_asia | native-country_central-america | native-country_europe | native-country_other | native-country_south-america | native-country_usa | y | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 63 | 15 | 4 | 0 | 3103 | 0 | 32 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 1 |
Diverse Counterfactual set (new outcome: 0.0)
| age | education-num | occupation | sex | capital-gain | capital-loss | hours-per-week | workclass_Goverment-sector | workclass_Private | workclass_Self-employed | workclass_Unemployed | education_Assoc-acdm | education_Assoc-voc | education_Bachelors | education_Doctorate | education_HS-grad | education_Masters | education_Prof-school | education_School | education_Some-college | marital-status_Never-married | marital-status_married | marital-status_not-married | relationship_Husband | relationship_Not-in-family | relationship_Other-relative | relationship_Own-child | relationship_Unmarried | relationship_Wife | race_Black | race_Other | race_White | native-country_asia | native-country_central-america | native-country_europe | native-country_other | native-country_south-america | native-country_usa | y | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | - | 6 | - | - | - | - | - | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
| 1 | - | 8 | - | - | - | - | - | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
| 2 | - | 11 | - | - | - | - | - | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
| 3 | - | 12 | - | - | - | - | - | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
| 4 | - | 5 | - | - | - | - | - | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
Analysis
Here we find the opposite case. In example 3 the model has predicted high income but if this person reduced the number of working hours per week the modekl would have predicted low income. In example 4 the same counterfactual trend holds for years in education
We now analyse the fair model manually and check if an increase in the number of years in education would make a change in the prediction of the model. The code is designed to test other counterfactuals as well.
# returns a copy of the Test set with exchanging the 'sample_id'th entry by some sample
def change_data(sample_id, sample):
X_test_changed=X_test.copy()
X_test_changed.iloc[[sample_id]] = sample
return X_test_changed
# predicting a certain instance with variable counterfactual features using the most fair model (dt_small_fair)
def predict_cf_fair(sample_id, education_num=None, hours_per_week=None, occupation=None, sex=None, race=None):
sample = get_instance(sample_id, education_num=education_num, hours_per_week=hours_per_week, occupation=occupation, sex=sex, race= race)
X_test_changed = change_data(sample_id, sample)
y_pred_ = dt_small_fair.predict(X_test_changed, sensitive_features=intersection_test)
return y_pred[sample_id]
# Predictions for the first 100 entries and setting eduction_num=14 while keeping the rest constant
preds_cf_fair = [predict_cf_fair(x, education_num=14) for x in range(100)]
# Usual predictions for the first 100 entries
preds_fair = dt_small_fair.predict(X_test, sensitive_features=intersection_test)[:100]
# indicies where predicitions differ
indicies = np.squeeze(np.where(preds_cf_fair != preds_fair))
print(f'Setting education_num=14 changed {len(indicies)} predictions for the first 100 test samples')
Setting education_num=14 changed 13 predictions for the first 100 test samples
# Inspecting one single instance
instance_id = indicies[1]
instance = X_train.iloc[[instance_id]]
instance
| age | education-num | occupation | sex | capital-gain | capital-loss | hours-per-week | workclass_Goverment-sector | workclass_Private | workclass_Self-employed | workclass_Unemployed | education_Assoc-acdm | education_Assoc-voc | education_Bachelors | education_Doctorate | education_HS-grad | education_Masters | education_Prof-school | education_School | education_Some-college | marital-status_Never-married | marital-status_married | marital-status_not-married | relationship_Husband | relationship_Not-in-family | relationship_Other-relative | relationship_Own-child | relationship_Unmarried | relationship_Wife | race_Black | race_Other | race_White | native-country_asia | native-country_central-america | native-country_europe | native-country_other | native-country_south-america | native-country_usa | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 14 | 40 | 11 | 2 | 1 | 0 | 0 | 40 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
print('Original prediction using the fair model', preds_fair[instance_id])
print('Counterfactual prediction (education=14) using the fair model', preds_cf_fair[instance_id])
Original prediction using the fair model 0 Counterfactual prediction (education=14) using the fair model 1
Here we can see how the fair model changes the prediction from low income to high income if the amount of education would have increased from 11 to 14 years. Out of the first 100 samples 13 predictions would have changed by setting the number of years in education to 14
age: continuous variablefnlwgt: final weight, continuous variable, the number of people the census believes the entry represents. Normally dropped.workclass: categorical variable indicating the person's type of employment (e.g., Private, Self-emp-not-inc, Local-gov)education: categorical variable indicating the highest level of education attained (e.g., Bachelors, Masters, Doctorate)education-num: continuous variable indicating the number of years of educationmarital-status: categorical variable indicating the person's marital status (e.g., Married-civ-spouse, Divorced, Never-married)occupation: categorical variable indicating the person's occupation (e.g., Exec-managerial, Craft-repair, Other-service)relationship: categorical variable indicating the person's relationship status (e.g., Husband, Not-in-family, Unmarried)race: categorical variable indicating the person's race (e.g., White, Black)sex: categorical variable indicating the person's sex (e.g., Male, Female)capital-gain: continuous variable indicating the amount of capital gain (e.g. sold stocks, real estate, etc.)capital-loss: continuous variable indicating the amount of capital losshours-per-week: continuous variable indicating the number of hours worked per weeknative-country: categorical variable indicating the person's country of origin (e.g., United-States, Mexico, India)income: categorical variable indicating whether the person's income exceeds $50K per year (<=50K, >50K)workclass, occupation and native-country. Around 6% of the data is concerned.race, sex and native-country personal information on name, birthday, etc. is not provided. Therefore there is no link to individuals. Please find a respective analysis of their distribution in the preprocessing part.((AAGE>16) && (AGI>100) && (AFNLWGT>1) && (HRSWK>0))names = ['age','workclass','fnlwgt','education','education-num', 'marital-status','occupation','relationship','race','sex', 'capital-gain','capital-loss','hours-per-week','native-country', 'y']fnlwgtfeature' ?'>50K vs. >50K. and <=50K vs. <=50K.age from object to float or intBasic Information:
The underlying model is a DecisionTreeClassifierfrom the sklearn.tree library. In order to optimize it in terms of fairness it was post processed using the ThresholdOptimizer from the fairlearn.postprocessing library. As fairness constraint selection_rate_parity was chosen which matches the selection rates for predefined sensitive features. Sensitive features in the predict method were set to sex and race.
DecisionTreeClassifier:
max_depth=3random_state=1scikit-learn 1.0.2ThresholdOptimizer:
Model type: Postprocessing algorithm. This technique takes as input an existing classifier and a list of sensitive features. Then, it derives a monotone transformation of the classifier's prediction to enforce the specified parity constraints.
Parameters:
estimator=DecisionTreeClassifier(max_depth=3 random_state=1)constraints='selection_rate_parity'objective='balanced_accuracy_scoreDocumentation: fairlearn.postprocessing.ThresholdOptimizer
Version: fairlearn 0.8.0
Source Code: github.com/fairlearn
References:
import fairlearn
import sklearn
print(fairlearn.__version__)
print(sklearn.__version__)
0.8.0 1.0.2
dt_small_fair
ThresholdOptimizer(constraints='selection_rate_parity',
estimator=DecisionTreeClassifier(max_depth=3,
random_state=1),
objective='balanced_accuracy_score')
Primary intended uses:
Primary intended users:
Out of scope:
Relevant factors:
Evaluation factors:
race and sex: the model was constrained on equal selection regarding the intersection of race and genderModel performance measures:
balanced_accuracy_score ( documentation)f1 score would be recommended (measures the tradeoff between precision and recall and is generally more robust than accuracy)Fairness metric:
selection_rate_parity was chosen as a fairness constraint which matches the selection rate across demographic groupsfalse_negative_rate or false_positive_rate can be chosenOverall Performance:
balanced_accuarcy_score: 0.71selection_rate_difference: 0.03pd_most_fair
| Model | Balanced Accuracy | Difference in Selection Rate | |
|---|---|---|---|
| 0 | ThresholdOptimizer(constraints='selection_rate_parity',\n estimator=DecisionTreeClassifier(max_depth=3,\n random_state=1),\n objective='balanced_accuracy_score') | 0.710012 | 0.031483 |
Selection Rate Parity
sex and race intersectionplot_fairness_fair_model(model=dt_small_fair,
model_name="Small DT Fair Model",
sensitive_features_test=intersection_test,
sensitive_features_names="sex and race",
print_report=False,
no_count=True
)
sex, race, native-country, occupationmodel development?**
ThresholdOptimizer to constrain selection rate differencesmaital-status is feature that was important to all models while more intuitve features like occupation were less important. That indicates that the model relies on spurious correlations (e.g. marital-status was highly correlated with relationship_husband and sex)fairlearn library, the model is no longer "sklearn"-standardized. Some libraries in the fair and explainable AI domain require a model to be implemented in a standard library like sklearn. For example I was unable to use the dice_ml framework for counterfactual analysis on the postprocessed model.[1] Molnar, Christoph. Interpretable machine learning (2020)
[4] US Census 2022, https://www.census.gov/quickfacts/fact/table/US
[5] Citizenship and Immigration Statuses of the U.S. Foreign-Born Population (2022) https://sgp.fas.org/crs/homesec/IF11806.pdf